 network complexity



Quantifying Emergence in Neural Networks: Insights from Pruning and Training Dynamics

AlShinaifi, Faisal, Almoaigel, Zeyad, Li, Johnny Jingze, Kuleib, Abdulla, Silva, Gabriel A.

arXiv.org Artificial Intelligence

Emergence, where complex behaviors develop from the interactions of simpler components within a network, plays a crucial role in enhancing neural network capabilities. We introduce a quantitative framework to measure emergence during the training process and examine its impact on network performance, particularly in relation to pruning and training dynamics. We hypothesize that the degree of emergence, defined by the connectivity between active and inactive nodes, can predict the development of emergent behaviors in the network. Through experiments with feedforward and convolutional architectures on benchmark datasets, we demonstrate that higher emergence correlates with improved trainability and performance. We further explore the relationship between network complexity and the loss landscape, suggesting that higher emergence indicates a greater concentration of local minima and a more rugged loss landscape. Pruning, which reduces network complexity by removing redundant nodes and connections, is shown to enhance training efficiency and convergence speed, though it may reduce final accuracy. These findings provide new insights into the interplay between emergence, complexity, and performance in neural networks, with valuable implications for the design and optimization of more efficient architectures.
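The abstract defines emergence via the connectivity between active and inactive nodes but does not give the formula. A minimal sketch of one plausible proxy (the function name, activity threshold, and normalization below are assumptions for illustration, not the authors' actual definition):

```python
import numpy as np

def emergence_score(weights, activations, tol=1e-6):
    """Hypothetical emergence proxy: the fraction of total connection
    weight that links inactive units (|activation| <= tol) to active
    ones. weights[i, j] is the connection from unit i to unit j."""
    active = np.abs(activations) > tol
    inactive = ~active
    # np.ix_ builds the (inactive -> active) submatrix of the weights
    cross = np.abs(weights[np.ix_(inactive, active)]).sum()
    total = np.abs(weights).sum()
    return cross / total if total > 0 else 0.0

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
a = rng.normal(size=8) * (rng.random(8) > 0.4)  # some units silenced
score = emergence_score(W, a)
assert 0.0 <= score <= 1.0  # proxy is a normalized fraction
```

Tracking such a score over training epochs would give one concrete way to correlate emergence with trainability, as the abstract proposes.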


Unexpected Benefits of Self-Modeling in Neural Systems

Premakumar, Vickram N., Vaiana, Michael, Pop, Florin, Rosenblatt, Judd, de Lucena, Diogo Schwerz, Ziman, Kirsten, Graziano, Michael S. A.

arXiv.org Artificial Intelligence

Self-models have been a topic of great interest for decades in studies of human cognition and more recently in machine learning. Yet what benefits do self-models confer? Here we show that when artificial networks learn to predict their internal states as an auxiliary task, they change in a fundamental way. To better perform the self-model task, the network learns to make itself simpler, more regularized, more parameter-efficient, and therefore more amenable to being predictively modeled. To test the hypothesis of self-regularizing through self-modeling, we used a range of network architectures performing three classification tasks across two modalities. In all cases, adding self-modeling caused a significant reduction in network complexity. The reduction was observed in two ways. First, the distribution of weights was narrower when self-modeling was present. Second, a measure of network complexity, the real log canonical threshold (RLCT), was smaller when self-modeling was present. Not only were measures of complexity reduced, but the reduction became more pronounced as greater training weight was placed on the auxiliary task of self-modeling. These results strongly support the hypothesis that self-modeling is more than simply a network learning to predict itself. The learning has a restructuring effect, reducing complexity and increasing parameter efficiency. This self-regularization may help explain some of the benefits of self-models reported in recent machine learning literature, as well as the adaptive value of self-models to biological systems. In particular, these findings may shed light on the possible interaction between the ability to model oneself and the ability to be more easily modeled by others in a social or cooperative context.
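The self-modeling setup described above amounts to adding an auxiliary loss term that penalizes error in predicting the network's own hidden activations. A minimal sketch of such a combined objective (the `aux_weight` parameter and the MSE form of the auxiliary term are assumptions; the paper varies the training weight placed on the self-model task):

```python
import numpy as np

def self_modeling_loss(task_logits, labels, hidden, hidden_pred,
                       aux_weight=0.1):
    """Combined objective: cross-entropy on the main classification
    task plus an auxiliary MSE term for predicting the network's own
    hidden state. Illustrative only; architectures and weightings in
    the paper may differ."""
    # numerically stable softmax cross-entropy for the main task
    z = task_logits - task_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    task_loss = -log_probs[np.arange(len(labels)), labels].mean()
    # self-model term: how well the auxiliary head reconstructs `hidden`
    aux_loss = np.mean((hidden_pred - hidden) ** 2)
    return task_loss + aux_weight * aux_loss
```

Increasing `aux_weight` corresponds to placing greater training weight on the self-model task, the knob the authors report makes the complexity reduction more pronounced.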


A Novel Algorithm for Community Detection in Networks using Rough Sets and Consensus Clustering

Grass-Boada, Darian H., González-Montesino, Leandro, Armañanzas, Rubén

arXiv.org Artificial Intelligence

Complex networks, such as those in social, biological, and technological systems, often pose challenges for community detection. Our research introduces a novel rough-clustering-based consensus community framework (RC-CCD) for effective identification of community structure in networks. The RC-CCD method employs rough set theory to handle uncertainty in the data and uses consensus clustering to aggregate multiple clustering results, enhancing the reliability and accuracy of community detection. This integration allows RC-CCD to handle overlapping communities, which are common in complex networks, offering a detailed and accurate representation of network structure. Comprehensive testing on benchmark networks generated by the Lancichinetti-Fortunato-Radicchi method showcased the strength and adaptability of the proposal across varying node degrees and community sizes. Cross-comparisons of RC-CCD against other well-known detection algorithms highlighted its stability and adaptability.
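Consensus clustering typically aggregates base partitions into a co-association matrix, and rough set theory then splits memberships into certain (lower approximation) and possible (boundary) regions. A minimal sketch of these two ingredients (the thresholds and function names are assumptions for illustration, not the RC-CCD algorithm itself):

```python
import numpy as np

def coassociation_matrix(partitions):
    """Consensus step: fraction of base clusterings in which each
    pair of nodes shares a community. Each partition is a list of
    community labels, one per node."""
    n = len(partitions[0])
    C = np.zeros((n, n))
    for labels in partitions:
        labels = np.asarray(labels)
        C += (labels[:, None] == labels[None, :]).astype(float)
    return C / len(partitions)

def rough_membership(C, core_thr=0.8, boundary_thr=0.4):
    """Rough-set style split: pairs above core_thr fall in the lower
    approximation (certain co-members); pairs between the thresholds
    form the boundary region, where overlapping membership lives."""
    lower = C >= core_thr
    boundary = (C >= boundary_thr) & (C < core_thr)
    return lower, boundary
```

Nodes whose pairs land in the boundary region are exactly the candidates for overlapping community membership, which is the case RC-CCD is designed to handle.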


Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks

Wang, Puyu, Lei, Yunwen, Wang, Di, Ying, Yiming, Zhou, Ding-Xuan

arXiv.org Machine Learning

Recently, significant progress has been made in understanding the generalization of neural networks (NNs) trained by gradient descent (GD) using the algorithmic stability approach. However, most of the existing research has focused on one-hidden-layer NNs and has not addressed the impact of different network scaling parameters. In this paper, we greatly extend the previous work \cite{lei2022stability,richards2021stability} by conducting a comprehensive stability and generalization analysis of GD for multi-layer NNs. For two-layer NNs, our results are established under general network scaling parameters, relaxing previous conditions. In the case of three-layer NNs, our technical contribution lies in demonstrating its nearly co-coercive property by utilizing a novel induction strategy that thoroughly explores the effects of over-parameterization. As a direct application of our general findings, we derive the excess risk rate of $O(1/\sqrt{n})$ for GD algorithms in both two-layer and three-layer NNs. This sheds light on sufficient or necessary conditions for under-parameterized and over-parameterized NNs trained by GD to attain the desired risk rate of $O(1/\sqrt{n})$. Moreover, we demonstrate that as the scaling parameter increases or the network complexity decreases, less over-parameterization is required for GD to achieve the desired error rates. Additionally, under a low-noise condition, we obtain a fast risk rate of $O(1/n)$ for GD in both two-layer and three-layer NNs.


Error convergence and engineering-guided hyperparameter search of PINNs: towards optimized I-FENN performance

Pantidis, Panos, Eldababy, Habiba, Tagle, Christopher Miguel, Mobasher, Mostafa E.

arXiv.org Artificial Intelligence

In our recently proposed Integrated Finite Element Neural Network (I-FENN) framework (Pantidis and Mobasher, 2023) we showcased how PINNs can be deployed on a finite-element-level basis to swiftly approximate a state variable of interest, and we applied it in the context of non-local gradient-enhanced damage mechanics. In this paper, we enhance the rigour and performance of I-FENN by focusing on two crucial aspects of its PINN component: a) the error convergence analysis and b) the hyperparameter-performance relationship. Guided by the available theoretical formulations in the field, we introduce a systematic numerical approach based on a novel set of holistic performance metrics to address both objectives. For the first objective, we explore in detail the convergence of the PINN training error and the global error against the network size and the training sample size. We demonstrate consistent convergence of both error types for every investigated combination of network complexity, dataset size, and choice of hyperparameters, which empirically confirms that the PINN setup and implementation conform to the available convergence theories. For the second objective, we establish a priori knowledge of the hyperparameters that favor higher predictive accuracy, lower computational effort, and the lowest chance of arriving at trivial solutions. The analysis leads to several outcomes that contribute to the better performance of I-FENN, and it fills a long-standing gap in the PINN literature with regard to the numerical convergence of the network errors while accounting for commonly used optimizers (Adam and L-BFGS). The proposed analysis method can be directly extended to other ML applications in science and engineering. The code and data used in the analysis are posted publicly to aid the reproduction and extension of this research.
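The convergence study described, sweeping network size and training sample size and recording the resulting errors, can be organized with a simple harness like the following (a hypothetical sketch: `train_fn`, the argument names, and the seed averaging are assumptions for illustration, not the I-FENN implementation):

```python
import numpy as np

def convergence_study(train_fn, widths, sample_sizes, seeds=(0, 1, 2)):
    """Hypothetical harness for the kind of study described: record
    the final error for each (network size, dataset size) pair so the
    error-vs-complexity trend can be inspected. `train_fn` is a
    user-supplied callable returning a scalar error."""
    results = {}
    for w in widths:
        for n in sample_sizes:
            # average over seeds to separate trend from initialization noise
            errs = [train_fn(width=w, n_samples=n, seed=s) for s in seeds]
            results[(w, n)] = (np.mean(errs), np.std(errs))
    return results
```

Plotting the recorded means against `widths` and `sample_sizes` on log axes is the usual way to check empirical rates against theoretical convergence predictions.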


Aruba rolls out new AIOps capabilities

#artificialintelligence

Network modernization is a key component of digital transformation initiatives for organizations looking to achieve better business outcomes. With that in mind, Aruba has announced its new Aruba Edge Services Platform with AIOps capabilities designed to reduce the time IT professionals spend on manual tasks such as network troubleshooting, performance tuning and Zero Trust/SASE security enforcement. As part of Aruba's growing family of AIOps solutions, these new capabilities aim to supplement overtaxed IT teams as they grapple with increasing network complexity and the rapid growth of IoT. For the first time, AIOps can be utilized for not just network troubleshooting but also performance optimization and critical security controls, Aruba said. With the growth of hybrid work, new user engagement models and challenges resulting from the Great Resignation and widening skills gaps, IT teams must find ways to achieve greater efficiencies and do away with time-intensive manual processes, the company said.


Connectivity-informed Drainage Network Generation using Deep Convolution Generative Adversarial Networks

Kim, Sung Eun, Seo, Yongwon, Hwang, Junshik, Yoon, Hongkyu, Lee, Jonghyun

arXiv.org Machine Learning

Stochastic network modeling is often limited by the high computational cost of generating enough networks for meaningful statistical evaluation. In this study, Deep Convolutional Generative Adversarial Networks (DCGANs) were applied to quickly reproduce drainage networks from already generated network samples, avoiding repeated long runs of the underlying stochastic network model, the Gibbs model. In particular, we developed a novel connectivity-informed method that converts the drainage network images into the directional information of flow at each node of the drainage network, and then transforms it into multiple binary layers in which the connectivity constraints between nodes are stored. DCGANs trained with three different types of training samples were compared: 1) original drainage network images, 2) their corresponding directional information only, and 3) the connectivity-informed directional information. Comparison of the generated images demonstrated that the connectivity-informed method outperformed the other two by training DCGANs more effectively and reproducing drainage networks more accurately, owing to its compact representation of the network complexity and connectivity. This work highlights that DCGANs are applicable to high-contrast images common in the earth and material sciences, where networks, fractures, and other high-contrast features are important.
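The connectivity-informed encoding described, flow direction at each node stored as multiple binary layers, can be illustrated with a one-hot-style conversion (a sketch under assumed conventions: direction codes 0-7 for the eight neighbors and -1 for no flow; not the authors' exact preprocessing):

```python
import numpy as np

def directions_to_layers(dir_grid, n_dirs=8):
    """Encode a grid of flow-direction codes (0..n_dirs-1, or -1 for
    no flow) as a stack of binary layers, one layer per direction,
    similar in spirit to the connectivity-informed representation."""
    layers = np.zeros((n_dirs,) + dir_grid.shape, dtype=np.uint8)
    for d in range(n_dirs):
        # layer d marks every node whose flow leaves in direction d
        layers[d] = (dir_grid == d).astype(np.uint8)
    return layers
```

Each binary layer is then a clean, high-contrast channel for the DCGAN, which is why this representation can train more effectively than raw network images.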


Dark Web's Doppelgängers Aim to Dupe Antifraud Systems

Communications of the ACM

Deep within the encrypted bowels of the dark Web, beyond the reach of regular search engines, hackers and cybercriminals are brazenly trading a new breed of digital fakes. Yet unlike AI-generated deepfake audio and video--which embarrass the likes of politicians and celebrities by making them appear to say or do things they never would--this new breed of imitators is aimed squarely at relieving us of our hard-earned cash. Comprising highly detailed fake user profiles known as digital doppelgängers, these entities convincingly mimic numerous facets of our digital device IDs, alongside many of our tell-tale online behaviors when conducting transactions and e-shopping. The result: credit card fraudsters can use these doppelgängers to attempt to evade the machine-learning-based anomaly-detecting antifraud measures upon which banks and payments service providers have come to rely. It is proving to be big criminal business: many tens of thousands of doppelgängers are now being sold on the dark Web.


Is AI the Antidote to Network Complexity? - SDxCentral

#artificialintelligence

Dreams of a future built on 5G networks and powered by IoT dominated the conversation at conferences in 2019. But for all these grand visions there's a problem: how do you manage networks with millions of cell sites connecting billions of IoT devices? According to some, the answer is better visibility enabled by artificial intelligence (AI). While most of these technologies are still years from reaching maturity, that's not stopping companies in the performance analytics space like EXFO and Vitria from investing big in machine learning (ML) and AI. According to Ken Gold, director of test, monitoring, and analytics solutions at EXFO, the implicit complexity associated with massive 5G IoT deployments is only going to make identifying and resolving network anomalies all the more challenging.